Learning Stable Classifiers by Transferring Unstable Features

Author: Bao Yujia
Barzilay Regina
Chang Shiyu
Publication date: 14/10/2021

While unbiased machine learning models are essential for many applications, bias is a human-defined concept that can vary across tasks. Given only input-label pairs, algorithms may lack sufficient information to distinguish stable (causal) features from unstable (spurious) features. However, related tasks often share similar biases -- an observation we may leverage to develop stable classifiers in the transfer setting. In this work, we explicitly inform the target classifier about unstable features in the source tasks. Specifically, we derive a representation that encodes the unstable features by contrasting different data environments in the source task. We achieve robustness by clustering data of the target task according to this representation and minimizing the worst-case risk across these clusters. We evaluate our method on both text and image classifications. Empirical results demonstrate that our algorithm is able to maintain robustness on the target task, outperforming the best baseline by 22.9% in absolute accuracy across 12 transfer settings. Our code is available at https://github.com/YujiaBao/Tofu
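The final step described in the abstract above, minimizing the worst-case risk across clusters, can be illustrated with a minimal sketch of the objective alone. The function name `worst_case_risk` and the toy losses and cluster assignments are hypothetical; this is not taken from the authors' Tofu code, and it omits how the clusters and the unstable-feature representation are obtained.

```python
from collections import defaultdict


def worst_case_risk(losses, cluster_ids):
    """Return the largest mean loss over clusters (the quantity to minimize)."""
    if len(losses) != len(cluster_ids):
        raise ValueError("losses and cluster_ids must have the same length")
    if not losses:
        raise ValueError("at least one example is required")
    totals = defaultdict(float)
    counts = defaultdict(int)
    # Accumulate the per-cluster loss sum and example count.
    for loss, cid in zip(losses, cluster_ids):
        totals[cid] += loss
        counts[cid] += 1
    # The worst-case risk is the maximum of the per-cluster average losses.
    return max(totals[c] / counts[c] for c in totals)


# Hypothetical per-example losses and cluster assignments.
losses = [0.2, 0.4, 1.0, 3.0]
clusters = [0, 0, 1, 1]
risk = worst_case_risk(losses, clusters)
```

In this toy case the plain average loss would hide the poorly served cluster, whereas the worst-case objective is driven entirely by it.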

arXiv.org e-Print Archive

Channel Vision Transformers: An Image Is Worth C x 16 x 16 Words

Author: Bao Yujia
Karaletsos Theofanis
Sivanandan Srinivasan
Publication date: 13/10/2023

Vision Transformer (ViT) has emerged as a powerful architecture in the realm of modern computer vision. However, its application in certain imaging fields, such as microscopy and satellite imaging, presents unique challenges. In these domains, images often contain multiple channels, each carrying semantically distinct and independent information. Furthermore, the model must demonstrate robustness to sparsity in input channels, as they may not be densely available during training or testing. In this paper, we propose a modification to the ViT architecture that enhances reasoning across the input channels and introduce Hierarchical Channel Sampling (HCS) as an additional regularization technique to ensure robustness when only partial channels are presented during test time. Our proposed model, ChannelViT, constructs patch tokens independently from each input channel and utilizes a learnable channel embedding that is added to the patch tokens, similar to positional embeddings. We evaluate the performance of ChannelViT on ImageNet, JUMP-CP (microscopy cell imaging), and So2Sat (satellite imaging). Our results show that ChannelViT outperforms ViT on classification tasks and generalizes well, even when a subset of input channels is used during testing. Across our experiments, HCS proves to be a powerful regularizer, independent of the architecture employed, suggesting itself as a straightforward technique for robust ViT training. Lastly, we find that ChannelViT generalizes effectively even when there is limited access to all channels during training, highlighting its potential for multi-channel imaging under real-world conditions with sparse sensors. Our code is available at https://github.com/insitro/ChannelViT
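The tokenization idea in the abstract above (patch tokens built independently from each channel, plus a channel embedding added to them) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the ChannelViT implementation: the function name `channel_patch_tokens` is hypothetical, the learned linear projection and positional embeddings are omitted, and the channel embeddings are fixed toy vectors rather than learned parameters.

```python
def channel_patch_tokens(image, patch, channel_emb):
    """Build one token per (channel, patch) pair.

    image: C x H x W nested lists; channel_emb: C vectors of length patch * patch.
    """
    tokens = []
    for c, plane in enumerate(image):
        height, width = len(plane), len(plane[0])
        if height % patch or width % patch:
            raise ValueError("H and W must be divisible by the patch size")
        for top in range(0, height, patch):
            for left in range(0, width, patch):
                # Flatten a single-channel patch into a vector.
                flat = [plane[top + i][left + j]
                        for i in range(patch) for j in range(patch)]
                # Add the embedding of the channel this patch came from.
                tokens.append([v + e for v, e in zip(flat, channel_emb[c])])
    return tokens


# Toy input: 2 channels of size 4 x 4, patch size 2 (hypothetical values).
image = [[[0] * 4 for _ in range(4)], [[1] * 4 for _ in range(4)]]
channel_emb = [[1] * 4, [10] * 4]
tokens = channel_patch_tokens(image, 2, channel_emb)
```

With C channels the sequence holds C times as many tokens as a standard ViT would produce for the same image and patch size, which is the trade-off the paper's title alludes to.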

arXiv.org e-Print Archive

Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes

Author: Acevedo Francisco
Armengol Victor Diego
Bao Yujia
Barzilay Regina
Braun Danielle
Deng Zhengyi
Hughes Kevin S
Kim Heeyoon
Ouardaoui Nofal
Parmigiani Giovanni
Wang Cathy
Wang Yan
Publication date: 24/04/2019

PURPOSE: The medical literature relevant to germline genetics is growing exponentially. Clinicians need tools for monitoring and prioritizing the literature to understand the clinical implications of the pathogenic genetic variants. We developed and evaluated two machine learning models to classify abstracts as relevant to the penetrance (risk of cancer for germline mutation carriers) or prevalence of germline genetic mutations. METHODS: We conducted literature searches in PubMed and retrieved paper titles and abstracts to create an annotated dataset for training and evaluating the two machine learning classification models. Our first model is a support vector machine (SVM) which learns a linear decision rule based on the bag-of-ngrams representation of each title and abstract. Our second model is a convolutional neural network (CNN) which learns a complex nonlinear decision rule based on the raw title and abstract. We evaluated the performance of the two models on the classification of papers as relevant to penetrance or prevalence. RESULTS: For penetrance classification, we annotated 3740 paper titles and abstracts and used 60% for training the model, 20% for tuning the model, and 20% for evaluating the model. The SVM model achieves 89.53% accuracy (percentage of papers that were correctly classified) while the CNN model achieves 88.95% accuracy. For prevalence classification, we annotated 3753 paper titles and abstracts. The SVM model achieves 89.14% accuracy while the CNN model achieves 89.13% accuracy. CONCLUSION: Our models achieve high accuracy in classifying abstracts as relevant to penetrance or prevalence. By facilitating literature review, this tool could help clinicians and researchers keep abreast of the burgeoning knowledge of gene-cancer associations and keep the knowledge bases for clinical decision support tools up to date.
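The first model in the abstract above applies a linear decision rule to a bag-of-ngrams representation. A minimal sketch of those two pieces follows. The function names, the example sentence, and the weights are all hypothetical: a real SVM would learn the weights from the annotated abstracts, and that training step is not shown here.

```python
from collections import Counter


def bag_of_ngrams(text, max_n=2):
    """Count all word n-grams of length 1..max_n in a lowercased text."""
    words = text.lower().split()
    bag = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            bag[" ".join(words[i:i + n])] += 1
    return bag


def linear_score(bag, weights, bias=0.0):
    """Linear decision rule: bias plus the weighted sum of n-gram counts."""
    return bias + sum(weights.get(g, 0.0) * count for g, count in bag.items())


# Hypothetical title text and hand-picked weights, for illustration only.
bag = bag_of_ngrams("BRCA1 mutation carriers cancer risk")
weights = {"cancer risk": 1.5, "mutation carriers": 1.0, "prevalence": -2.0}
score = linear_score(bag, weights, bias=-0.5)
is_penetrance = score > 0
```

The sign of the score gives the predicted class; n-grams absent from the weight table contribute nothing to it.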

arXiv.org e-Print Archive

DSpace@MIT

Investigation of hot spring gas components and soil gas fluxes in Arxan Holocene volcanic field, Inner Mongolia, NE China

Author: Baoxiao Bao
Di Han
Guohui Gu
Sheng Guan
Xiaodong Pan
Yujia Song
Publication venue: Frontiers Media SA
Publication date: 01/05/2023

The latest research results show that there is a unified magma system and heating channel beneath the Arxan volcanic field, indicating a potential risk of eruption. The Arxan volcanic field features multiple gas emission sites (e.g., Jinjianggou hot springs and Yinjianggou hot springs) and exhibits strong hydrothermal activity. In this study, measurements of the hot spring gas composition and soil CO2 flux in the Arxan Holocene volcanic field were conducted, and the results were combined with previous research results to analyze the degassing characteristics of this region. The results show that the volcanic gases in the Arxan volcanic field are composed of 0.07%–1.09% CO2, 0.33–12 ppm CH4, 1.57–53 ppm H2, 800–30,241 ppm He, and 1.14%–1.86% Ar. The He content in this area is notably higher than that in other dormant volcanoes in China. This difference is possibly caused by U–Th decay in the Mesozoic granodiorite and acidic volcanic rocks in the study area, which can produce substantial radiogenic He. The soil gas concentrations near the Jinjianggou and Yinjianggou hot springs are higher than those of two Holocene volcanoes. The peak CO2 concentration in the soil near the Jinjianggou hot spring can reach 35,161 ppm. The single-site soil microseepage CO2 flux in the Arxan volcanic field is 4.66–107.18 g m⁻² d⁻¹, and the estimated annual CO2 emission flux from the volcanic field to the atmosphere is 0.63 × 10⁵ t, which also demonstrates that soil CO2 flux of Arxan volcano is comparable to the soil CO2 emission level of the Iwojima volcano.

Directory of Open Access Journals

Ivermectin induces apoptosis of esophageal squamous cell carcinoma via mitochondrial pathway

Author: Bao Dengke
Han Jingru
Li Yanming
Li Yujia
Liu Junqi
Lu Mengmeng
Si Jiaoyang
Wang Jiaxin
Wei Xiajie
Xu Nana
Yang Hushan
Yang Xiaotian
Yao Xiaojuan
Zhang Juanmei
Publication venue: Jefferson Digital Commons
Publication date: 07/12/2021

Background: Esophageal squamous cell carcinoma (ESCC) is among the most common primary malignant tumors worldwide, especially in China. To date, successful treatment remains a major clinical challenge, and it is imperative to develop effective therapeutic agents. Methods: The anti-proliferative effect of ivermectin on ESCC was investigated in a cell model and in a nude mouse model. Cell apoptosis was assessed using flow cytometry, TUNEL assay and western blotting. Mitochondrial dysfunction was determined by reactive oxygen species accumulation, mitochondrial membrane potential and ATP levels. Results: Our results showed that ivermectin significantly inhibited the proliferation of ESCC cells in vitro and in vivo. Furthermore, we found that ivermectin markedly mediated mitochondrial dysfunction and induced apoptosis of ESCC cells, which indicated that the anti-proliferative effect of ivermectin on ESCC cells involves the mitochondrial apoptotic pathway. Mechanistically, ivermectin significantly triggered ROS accumulation, inhibited the activation of the NF-κB signaling pathway, and increased the ratio of Bax/Bcl-2. Conclusions: These findings indicate that ivermectin has significant anti-tumour potential for ESCC and may be a potential therapeutic candidate against ESCC.

Jefferson Digital Commons

Increased Drp1 promotes autophagy and ESCC progression by mtDNA stress mediated cGAS-STING pathway.

Author: Bao Dengke
Chen Hui
Li Yujia
Liu Hongliang
Liu Junqi
Niu Menglan
Wan Lixin
Wan Shaogui
Wang Jiaxin
Wang Yanming
Wu Yuanyuan
Yang Hushan
Yang Qi
Yang Yating
Zhao Jing
Publication venue: Jefferson Digital Commons
Publication date: 24/02/2022

Background: Mitochondrial dynamics homeostasis is important for cell metabolism, growth, proliferation, and immune responses. Drp1, the critical GTPase for mitochondrial fission, is frequently upregulated in many cancers and is closely implicated in tumorigenesis. However, the mechanism by which Drp1 influences tumor progression is largely unknown, especially in esophageal squamous cell carcinoma (ESCC). Methods: Immunohistochemistry was used to examine Drp1 and LC3B expression in tissues of ESCC patients. Autophagic vesicles were investigated by transmission electron microscopy. Fluorescent LC3B puncta and mitochondrial nucleoids were observed by fluorescent and confocal microscopy. Mitochondrial function was evaluated by mitochondrial membrane potential, ROS and ATP levels. A xenograft tumor model was established in BALB/c nude mice to analyze the role of Drp1 in ESCC progression. Results: We found that high Drp1 expression is correlated with poor overall survival of ESCC patients. Drp1 overexpression promotes cell proliferation and xenograft ESCC tumor growth by triggering autophagy. Furthermore, we demonstrated that Drp1 overexpression disturbs mitochondrial function and subsequently induces the release of mitochondrial DNA (mtDNA) into the cytosol, thereby inducing cytosolic mtDNA stress. Mechanistically, cytosolic mtDNA activates the cGAS-STING pathway and facilitates autophagy, which promotes ESCC cancer growth. Moreover, mtDNA digestion with DNase I and autophagy inhibition with chloroquine attenuate the cGAS-STING pathway activation and ESCC cancer growth. Conclusions: Our findings reveal that Drp1 overexpression induces mitochondrial dysfunction and cytosolic mtDNA stress, which subsequently activates the cGAS-STING pathway, triggers autophagy and promotes ESCC progression.

PubMed Central

Jefferson Digital Commons

Deriving machine attention from human rationales

Author: Bao Yujia
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2019

Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from PDF version of thesis. Includes bibliographical references (pages 47-50). Attention-based models are successful when trained on large amounts of data. In this thesis, we demonstrate that even in the low-resource scenario, attention can be learned effectively. To this end, we start with discrete human-annotated rationales and map them into continuous attention. Our central hypothesis is that this mapping is general across domains, and thus can be transferred from resource-rich domains to low-resource ones. Our model jointly learns a domain-invariant representation and induces the desired mapping between rationales and attention. Our empirical results validate this hypothesis and show that our approach delivers significant gains over state-of-the-art baselines, yielding over 15% average error reduction on benchmark datasets. Our code and data are available at https://github.com/YujiaBao/R2A.

DSpace@MIT

Efficient and Robust Algorithms for Practical Machine Learning

Author: Bao Yujia
Publication venue: Massachusetts Institute of Technology
Publication date: 21/06/2022

Machine learning models are biased when trained on biased datasets. Many recent approaches have been proposed to mitigate biases when they are identified a priori. However, in real-world applications, annotating biases is not only time-consuming but also challenging. This thesis considers three different scenarios and presents novel algorithms for learning robust models. These algorithms are efficient as they do not require explicit annotations of the biases, enabling practical machine learning. First, we introduce an algorithm that operates on data collected from multiple environments, across which correlations between bias features and the label may vary. We show that when using a classifier trained on one environment to make predictions on examples from a different environment, its mistakes are informative of the hidden biases. We then leverage these mistakes to create groups of examples whose interpolation yields a distribution with only stable correlations. Our algorithm achieves the new state-of-the-art on four text and image classification tasks. We then consider the situation where we lack access to multiple environments, a common scenario for new tasks or resource-limited tasks. We show that in real-world applications related tasks often share similar biases. Based on this observation, we propose an algorithm that infers bias features from a resource-rich source task and transfers this knowledge to the target task. Compared to 15 baselines across five datasets, our method consistently delivers significant performance gain. Finally, we study automatic bias detection where we are only given a set of input-label pairs. Our algorithm learns to split the dataset so that classifiers trained on the training split cannot generalize to the testing split. The performance gap provides a proxy for measuring the degree of bias in the learned features and can therefore be used to identify unknown biases. Experiments on six NLP and vision tasks demonstrate that our method is able to generate spurious splits that correlate with human-identified biases. Ph.D.
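The first algorithm in the abstract above uses the mistakes of a classifier trained on one environment to form groups of examples in another. The sketch below shows one simple way such groups could be formed, assuming (this is an assumption, not stated in the abstract) that examples are partitioned by their label and by whether the cross-environment prediction was correct. The function name and the toy data are hypothetical, and the interpolation step that follows is not shown.

```python
from collections import defaultdict


def group_by_mistakes(labels, predictions):
    """Partition example indices by (label, prediction-was-correct)."""
    groups = defaultdict(list)
    for idx, (y, y_hat) in enumerate(zip(labels, predictions)):
        # Assumed grouping key: the true label and the correctness flag.
        groups[(y, y == y_hat)].append(idx)
    return dict(groups)


# Hypothetical labels from one environment and predictions made by a
# classifier trained on a different environment.
labels = [1, 1, 0, 0, 1]
predictions = [1, 0, 0, 1, 1]
groups = group_by_mistakes(labels, predictions)
```

Examples the transferred classifier gets wrong are those where the correlations it relied on do not hold, which is why the correctness flag can act as a stand-in for the unannotated bias.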

DSpace@MIT